15.2 Data structures
// 2024-05-04-wk15-02-data-structures.md
A data structure is a way of organizing data: a data representation together with the list of operations that are possible on it.
All of the Python data types are powerful data structures, under the hood.
- int: integers
- float: floating point numbers
- str: strings
- list: lists
- tuple: tuples
- dict: dictionaries
- set: sets
For each of these data types (and any Python object) you can learn what they can do by checking out their attributes and methods.
Even the lowly int data type has some interesting methods that you have been using without even knowing it.
In this module 15.2 you will learn about a few different data structures that will come in handy for your future Python work. And you will be able to use one of the data structures that you learn about to make your data-driven text adventure a lot easier to understand.
The int data structure
Because it is so simple, you have probably never thought about an integer as a data structure.
But there is a lot of magic going on under the hood within the Python int data type.
>>> one = 1
>>> negative_one = -1
>>> zero = one + negative_one
>>> print(zero)
0
>>> dir(one)
['__abs__', '__add__', '__and__', ...]
>>> dir(3)
['__abs__', '__add__', '__and__', ...]
If you haven’t seen a function or variable name like __abs__ before, the double underscores at the beginning and end of the name may look a little strange.
These are “hidden” methods.
Python hides them from you, but because Python is open source, you can always “pop the hood” to see how great programmers designed these data structures using the built-in dir function.
For many objects you can also print out the __dict__ attribute, although instances of built-in types such as int do not have one.
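Here is a small sketch of how you might explore these names yourself. The variable name dunder_names is made up for this example, and the full list that dir returns can vary a little between Python versions.

```python
# Collect the "dunder" (double underscore) names that dir() reports for an int
one = 1
dunder_names = [
    name for name in dir(one)
    if name.startswith("__") and name.endswith("__")
]
print(dunder_names[:3])  # the first three names, in alphabetical order

# An int instance has no __dict__ of its own, but the int class does
print(hasattr(one, "__dict__"))
print("__abs__" in vars(int))
```

The built-in vars function returns the __dict__ of whatever you pass it, so vars(int) is another way to look inside the int class.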
At work you may hear people call these “dunder methods.” When you’re doing a podcast or trying to tell someone what to type on their keyboard, “dunder” is a lot easier to say than “double underscore.”
Can you guess what the int.__abs__() operation does?
>>> one.__abs__()
1
>>> negative_one
-1
>>> negative_one.__abs__()
1
What about the other two dunder methods, __add__ and __and__?
>>> one.__add__(negative_one)
0
>>> one + negative_one
0
The __add__ method is what is called a “binary operator.”
It needs two values to do its work.
The __abs__ method works with only one value and is called a “unary operator.”
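To see that + and abs() really do call these dunder methods, you can define them on a class of your own. The Steps class below is a hypothetical example invented for this sketch, not something built into Python.

```python
class Steps:
    """Hypothetical class: a signed count of steps, forward (+) or backward (-)."""

    def __init__(self, count):
        self.count = count

    def __abs__(self):
        # Unary: works with only one value (self)
        return Steps(abs(self.count))

    def __add__(self, other):
        # Binary: needs two values (self and other)
        return Steps(self.count + other.count)


forward = Steps(3)
backward = Steps(-5)
total = forward + backward   # Python calls forward.__add__(backward)
distance = abs(total)        # Python calls total.__abs__()
print(total.count, distance.count)
```

Because Steps defines __add__ and __abs__, the + operator and the abs function work on it the same way they work on an int.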
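What about the __and__ method?
Is that a binary or a unary operator?
Where have you seen the word and in Python before?
>>> bool(one)
True
>>> bool(zero)
False
>>> one and zero
0
>>> one & zero
0
>>> one.__and__(zero)
0
Notice that the results are the integer 0 rather than False. The and keyword returns one of its operands, and __and__ is actually the method behind the bitwise & operator, which happens to agree with and for the values 1 and 0.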
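The and keyword and the & operator only look alike when you stick to 0 and 1. Here is a quick sketch, using the made-up variable names below, that tries the integers 2 and 1 instead. In the Python data model, & is the operator that calls __and__.

```python
two = 2   # binary 10
one = 1   # binary 01

keyword_result = two and one      # both are truthy, so `and` returns the last operand
bitwise_result = two & one        # the two numbers share no bits
dunder_result = two.__and__(one)  # the method behind the & operator

print(keyword_result, bitwise_result, dunder_result)
```

So two and one gives you back one of the original values, while two & one compares the numbers bit by bit.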
These are just the first 3 of the roughly 70 attributes and methods built into the Python int data type (the exact number depends on your Python version).
To learn more about these basic data structures, use your IDE (Spyder) to find out how many attributes and methods a dictionary (dict) data type has.
And see if you can run one of the methods within a dict data structure just like you did for the int data structure.
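If you get stuck, here is one possible way to start. The inventory dictionary is made-up example data, and the count that dir reports will depend on your Python version.

```python
inventory = {"lamp": 1, "key": 2}   # hypothetical example data

# Count the names that dir() reports for a dict
print(len(dir(inventory)))

# Square brackets, len() and `in` each map to a dunder method
print(inventory.__getitem__("key"))     # same as inventory["key"]
print(inventory.__len__())              # same as len(inventory)
print(inventory.__contains__("lamp"))   # same as "lamp" in inventory
```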
More powerful data structures
You may not think of Python data types when you hear the term “data structure”.
However, in the workplace when people use the term “data structure” they mean an organized way of storing and manipulating collections of data values (objects).
A sorted list of integers stored in an array is a common data structure.
And the real power of data structures becomes apparent once you start nesting simpler container data types, such as lists and dicts, within each other.
In this section you will learn a bit more about the “graph” data structure, which you can implement using a dictionary of dictionaries.
Data types and data structures that can be used to hold other data objects are called “containers.” Containers can even contain themselves or other containers, creating a nested data structure. So a data structure is how you use container data types to organize your data so that you can process it for whatever problem you want to solve. The way you design your data structure can make your code extremely complicated or very simple.
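As a rough illustration of nesting, the sketch below puts lists inside dicts inside a list. The rooms data is invented for this example.

```python
# A list of dicts: each dict describes one (made-up) room
rooms = [
    {"name": "kitchen", "items": ["knife", "bread"]},
    {"name": "cellar", "items": []},
]

# Follow the nesting: list -> dict -> list -> str
second_kitchen_item = rooms[0]["items"][1]
print(second_kitchen_item)

# A list can even contain itself
loop = [1, 2]
loop.append(loop)
print(loop[2] is loop)
```

Each pair of square brackets steps one level deeper into the nested structure.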
Data structures are the fundamental building blocks of any program. A computer science degree usually involves several courses in data structures and database design. Some of the data structures you will learn about have whole courses devoted to them and have become whole industries all by themselves. Here are some data structures you have already seen, and some new ones that you may want to use in your future Python work.
- array: list
- linked list: a list where you can only do a “sequential scan” in the order of the list and cannot skip around (cannot use “random access”)
- table: list of lists or list of dicts
- relational databases: tables connected by relationships
- mapping: dict
- directed graph: dict of dicts or connection matrix (list of lists) or edge list (list of 2-tuples) or adjacency list
- undirected graph
- directed acyclic graph
- tree: a directed acyclic graph where every child node has only one parent (or every worker node has only one boss)
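To make the directed graph entry in the list above a little more concrete, here is a minimal sketch of the same tiny graph stored three ways. The node names a, b and c are invented, and storing True as the inner value is just one assumption about what you might keep for each connection.

```python
# Edge list: a list of 2-tuples (source, destination)
edge_list = [("a", "b"), ("a", "c"), ("b", "c")]

# dict of dicts: the outer key is the source node, the inner key is the destination
graph = {}
for source, destination in edge_list:
    graph.setdefault(source, {})[destination] = True
    graph.setdefault(destination, {})   # make sure every node has an entry

# Connection matrix: a list of lists with one row and one column per node
nodes = sorted(graph)
matrix = [
    [1 if destination in graph[source] else 0 for destination in nodes]
    for source in nodes
]

print(graph)
print(matrix)
```

The dict of dicts makes it easy to ask "where can I go from here?", while the connection matrix makes it easy to see every possible pair of nodes at once.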
In the next section you will learn about one special data structure called a graph. This is the data structure I use when building chatbots for customers around the world.
Graph data structure
In computer science, a graph is a data structure containing objects connected to each other in a web or network of relationships. It is also sometimes called a network data structure. For example, the social graph data structure at Facebook (Meta) contains all the users and their connections to each other through “friend” relationships.
On Facebook, the friend relationships are mutual.
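A mutual relationship like this can be sketched with a dictionary of dictionaries by recording every connection in both directions. The function name add_friendship and the user names are hypothetical, and this is only a toy model, not how Facebook actually stores its social graph.

```python
friends = {}   # maps each user name to a dict of that user's friends


def add_friendship(graph, user_a, user_b):
    """Record a mutual friendship by storing the connection in both directions."""
    graph.setdefault(user_a, {})[user_b] = "friend"
    graph.setdefault(user_b, {})[user_a] = "friend"


add_friendship(friends, "ana", "bo")
add_friendship(friends, "bo", "cy")

print(sorted(friends["bo"]))
print("ana" in friends["cy"])
```

Because each friendship is stored twice, you can look up the friends of either person without searching the whole graph.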